Two Dimensional Evaluation Reinforcement Learning
Authors
Abstract
To address the trade-off between exploration and exploitation in reinforcement learning, the authors have proposed two-dimensional evaluation reinforcement learning, which keeps separate evaluation forecasts for reward and punishment. The proposed method uses the difference between the reward evaluation and the punishment evaluation as the factor for determining the action, and their sum as the parameter for determining the ratio of exploration to exploitation. This paper describes an experiment in which a mobile robot searches for a path and encounters the resulting conflict between exploration and exploitation actions. The results of the experiment show that the proposed method, which uses the two dimensions of reward and punishment, can generate a better path than the conventional reinforcement learning method.
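The abstract does not give the exact formulas, so the following Python is only a minimal sketch of the idea it describes: the difference of the two evaluations ranks the actions, and their sum sets how much to explore. The names `reward_eval`, `punish_eval`, and `base_temperature`, the use of Boltzmann (softmax) selection, and the rule that a larger sum lowers the temperature are all assumptions made for illustration, not the authors' published method.

```python
import math
import random


def action_probabilities(reward_eval, punish_eval, base_temperature=1.0):
    """Return selection probabilities for each action in one state.

    ASSUMPTION: reward_eval and punish_eval are non-negative per-action
    forecasts; the abstract does not specify their scale.
    """
    # The difference (reward minus punishment) decides which action is preferred.
    difference = [r - p for r, p in zip(reward_eval, punish_eval)]

    # The sum controls the exploration/exploitation ratio.
    # ASSUMPTION: a larger total sum means lower temperature (more exploitation).
    total = sum(r + p for r, p in zip(reward_eval, punish_eval))
    temperature = base_temperature / (1.0 + total)

    # Softmax over the differences; subtracting the max keeps exp() stable.
    top = max(difference)
    weights = [math.exp((d - top) / temperature) for d in difference]
    norm = sum(weights)
    return [w / norm for w in weights]


def select_action(reward_eval, punish_eval, base_temperature=1.0, rng=random):
    """Sample one action index according to action_probabilities."""
    probs = action_probabilities(reward_eval, punish_eval, base_temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Under these assumptions, a state whose evaluations are all zero yields a uniform choice (pure exploration), while strongly evaluated states concentrate probability on the action with the largest difference.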
Similar resources
Realworld Robot Navigation by Two Dimensional Evaluation Reinforcement Learning
The trade-off between exploration and exploitation is present in any learning method based on trial and error, such as reinforcement learning. We have proposed a reinforcement learning algorithm that uses reward and punishment as repulsive evaluation (2D-RL). In the algorithm, an appropriate balance between exploration and exploitation can be attained by using interest and utility. In this paper, we appl...
Exploratory Gradient Boosting for Reinforcement Learning in Complex Domains
High-dimensional observations and complex real-world dynamics present major challenges in reinforcement learning for both function approximation and exploration. We address both of these challenges with two complementary techniques. First, we develop a gradient-boosting style, nonparametric function approximator for learning on Q-function residuals. Second, we propose an exploration strategy...
Gaussian Processes in Reinforcement Learning
We exploit some useful properties of Gaussian process (GP) regression models for reinforcement learning in continuous state spaces and discrete time. We demonstrate how the GP model allows evaluation of the value function in closed form. The resulting policy iteration algorithm is demonstrated on a simple problem with a two dimensional state space. Further, we speculate that the intrinsic abili...
Evaluation of Ultimate Torsional Strength of Reinforcement Concrete Beams Using Finite Element Analysis and Artificial Neural Network
Because of the lack of a theory of elasticity, estimating the ultimate torsional strength of reinforced concrete beams is a difficult task. Therefore, finite element methods can be applied to determine the strength of concrete beams. Furthermore, for complicated, highly nonlinear and ambiguous conditions, artificial neural networks are appropriate tools for predicting behavior in such states. ...
Parallel Recombinative Reinforcement Learning: A Genetic Approach
A technique is presented that is suitable for function optimization in high-dimensional binary domains. The method allows an efficient parallel implementation and is based on the combination of genetic algorithms and reinforcement learning schemes. More specifically, a population of probability vectors is considered, each member corresponding to a reinforcement learning optimizer. Each probabil...
Publication date: 2001